-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add motion vectors support for animated meshes #9902
Conversation
33d1c9d
to
ba6ec25
Compare
dadb19f
to
81e04f5
Compare
81e04f5
to
cd39ce0
Compare
cd39ce0
to
6eb014a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think after this + #8974 we're good to move TAA out of the "experimental" module, thanks so much for doing this!
99ec6f7
to
6f71fd2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not super keen on the complexity of the binding stuff, but it's too late in the day for me to try to see if I can think of something I would like more. :)
9e28cbd
to
52892b6
Compare
I'm pretty happy with what I came up with. If it's not a global minima in term of complexity, it's definitively a local minima. It's extremely shallow, it's basically function calls (a single layer of call stack, at it) and nothing else… I would recommend looking at the file without the diff, as the deleted code just makes things less clear. I'll add some comments to the code to make it easier to read. |
5f713d5
to
4ad75fe
Compare
I think we should move this to 0.13, and give it more time instead of trying to rush it through, and then test it with the other TAA changes for 0.13. |
For the record, I'm for moving this to 0.13. I think it makes the most change in the context of other TAA changes, which will be moved as well. |
ccf0911
to
075762e
Compare
Co-authored-by: Robert Swain <[email protected]>
075762e
to
f01ce08
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please follow the style of the rest of bevy with newlines between type and impl, and between bare/impl functions.
let weight_count = bevy_pbr::morph::layer_count(); | ||
for (var i: u32 = 0u; i < weight_count; i++) { | ||
let weight = bevy_pbr::morph::previous_weight_at(i); | ||
if weight != 0.0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't new, so not blocking, but I wonder if this branch is helpful or harmful. It probably has to run both branches regardless I would think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a really good question. My GPU-fu is not strong enough to be able to answer it. I don't remember the motivation behind it.
/// access the current buffer with [`DoubleBuffer::current`], | ||
/// and the previous one with [`DoubleBuffer::previous`]. | ||
#[derive(Default)] | ||
pub struct DoubleBuffer<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal for now, but I'd like to have some MultiBuffer<const N: usize, T>
where N
is the number of buffers that get rotated through. It is apparently common to use 3-4 buffers of dynamic per-frame data like mesh uniforms. This is because the GPU can have 2-3 frames in flight, meaning the rendering depends on the contents of those buffers, and if we want to update the contents, wgpu would have to wait until the target buffer is no longer in use by the GPU to be able to transfer the data to it. We can take that in a separate PR though, and just make DoubleBuffer<T>
into MultiBuffer<2, T>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds like a fun thing to implement. But I think it should be part of a different PR.
/// buffer, including the motion vector buffer. | ||
/// | ||
/// Used in the prepass pipeline. | ||
const VIEW_MOTION_VECTOR_PREPASS = (1 << 13); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want / need this anymore? I thought the changes to MOTION_VECTOR_PREPASS
were meant to indicate when a motion vector prepass is being/has been run and so motion vectors are present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is still a distinction between "Does this view have a motion vector buffer" and "does this mesh support motion vectors" for the specific mesh being rendered.
if is_motion_vectors { | ||
// disable is_motion_vectors if this mesh doesn't have a previous value for | ||
// skins or morphs. This way, we get the shader variant without the bindings for | ||
// the previous value, which doesn't exist. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be simplified with a dummy binding? That is, we bind a fallback buffer (like the fallback images), but we don't use it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great. Though, at the cost of pushing the "is motion vector" info to a uniform, add the is_motion_vector flag to the uniform for each mesh, enable that uniform conditionally on MOTION_VECTOR_PREPASS
And not read the motion vector specific buffers when the bit is off.
This would allow getting rid of the combinatorial shader variants too, as we could set unconditionally the previous_joint_matrices/previous_morph_weight buffers.
Though no idea how much work & code that is.
if let Some(skin) = previous_skin { | ||
bind_group_entries.push(skinning_entry(4, skin)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if let Some(skin) = previous_skin { | |
bind_group_entries.push(skinning_entry(4, skin)); | |
if let Some(previous_skin) = previous_skin { | |
bind_group_entries.push(skinning_entry(4, previous_skin)); |
ShaderStages, TextureSampleType, TextureViewDimension, | ||
// --- Individual layout entries and individual bind group entries --- | ||
|
||
fn buffer_layout(binding: u32, size: u64, visibility: ShaderStages) -> BindGroupLayoutEntry { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @IceSentry is doing something similar to simplify construction of bind group layout entries and bind group entries.
previous_skin: Option<&Buffer>, | ||
previous_weights: Option<&Buffer>, | ||
) { | ||
let mut bind_group_entries = SmallVec::<[BindGroupEntry; 6]>::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what benefit SmallVec brings here other than overhead / an unnecessary dependency as this code would not exceed the array on the stack. I'd just use an array and an index variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have personally used ArrayVec
from the arrayvec
crate, but SmallVec
is already a dependency of bevy_pbr
, so why not use it?
The advantage compared to using an array and subslicing it is that we don't have to initialize zero-valued BindGroupEntry
(BindGroupEntry
does not implement Default
, so it would require a lot of boilerplate code). We could use [MaybeUninit<BindGroupEntry>; 6]
, but it would also require a bit of boilerplate and unsafe
unrelated to what we want to accomplish, and that's pretty much what SmallVec
already does for us regardless.
let layout = |bits| variant_layout(ActiveVariant::from_bits_truncate(bits), device); | ||
MeshLayouts(array::from_fn(layout)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. :)
let layout = |bits| variant_layout(ActiveVariant::from_bits_truncate(bits), device); | ||
MeshLayouts(array::from_fn(layout)) | ||
} | ||
// TODO: Passing 3 bools is pretty bad |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just pass an ActiveVariant instead? .get(ActiveVariant::SKIN | ActiveVariant::MOTION_VECTORS)
reads fine to me.
Oh, you have that just below. Then, is this function needed?
/// access the current buffer with [`DoubleBuffer::current`], | ||
/// and the previous one with [`DoubleBuffer::previous`]. | ||
#[derive(Default)] | ||
pub struct DoubleBuffer<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be in bevy_render.
Needs rebasing, please |
It isn't worthwhile investing more time in this if I don't know what the way forward to get it merged is. |
morph targets. This is a revamped version of bevyengine#9902. Instead of adding more bind group layouts as that patch did, which created a combinatorial explosion of layouts, this patch unconditionally adds `prev_joint_matrices` and `prev_morph_weights` bindings to the shader bind groups. These add no significant overhead if unused because we simply bind dummy buffers, and the driver strips them out if unused. We already do this extensively with the various `StandardMaterial` bindings as well as the mesh view bindings, so this approach isn't anything new. The overall technique consists of double-buffering the joint matrix and morph weights buffers, as most of the previous attempts to solve this problem did. The process is generally straightforward. Note that, to avoid regressing the ability of mesh extraction, skin extraction, and morph target extraction to run in parallel, I had to add a new system to rendering, `set_mesh_motion_vector_flags`. The comment there explains the details; it generally runs very quickly. I've tested this with modified versions of the `animated_fox`, `morph_targets`, and `many_foxes` examples that add TAA, and the patch works. To avoid bloating those examples, I didn't add switches for TAA to them. Addresses points (1) and (2) of bevyengine#8423.
morph targets. This is a revamped version of bevyengine#9902. Instead of adding more bind group layouts as that patch did, which created a combinatorial explosion of layouts, this patch unconditionally adds `prev_joint_matrices` and `prev_morph_weights` bindings to the shader bind groups. These add no significant overhead if unused because we simply bind dummy buffers, and the driver strips them out if unused. We already do this extensively with the various `StandardMaterial` bindings as well as the mesh view bindings, so this approach isn't anything new. The overall technique consists of double-buffering the joint matrix and morph weights buffers, as most of the previous attempts to solve this problem did. The process is generally straightforward. Note that, to avoid regressing the ability of mesh extraction, skin extraction, and morph target extraction to run in parallel, I had to add a new system to rendering, `set_mesh_motion_vector_flags`. The comment there explains the details; it generally runs very quickly. I've tested this with modified versions of the `animated_fox`, `morph_targets`, and `many_foxes` examples that add TAA, and the patch works. To avoid bloating those examples, I didn't add switches for TAA to them. Addresses points (1) and (2) of bevyengine#8423.
#13572 is being merged, which tackles this same work. |
…orph targets. (#13572) This is a revamped equivalent to #9902, though it shares none of the code. It handles all special cases that I've tested correctly. The overall technique consists of double-buffering the joint matrix and morph weights buffers, as most of the previous attempts to solve this problem did. The process is generally straightforward. Note that, to avoid regressing the ability of mesh extraction, skin extraction, and morph target extraction to run in parallel, I had to add a new system to rendering, `set_mesh_motion_vector_flags`. The comment there explains the details; it generally runs very quickly. I've tested this with modified versions of the `animated_fox`, `morph_targets`, and `many_foxes` examples that add TAA, and the patch works. To avoid bloating those examples, I didn't add switches for TAA to them. Addresses points (1) and (2) of #8423. ## Changelog ### Fixed * Motion vectors, and therefore TAA, are now supported for meshes with skins and/or morph targets.
Objective
bevy_motion_vec_anim_3-2023-09-22.mp4
Solution
DoubleBufferVec
, a thin wrapper around twoBufferVec
BufferVec
used inSkinUniform
andMorphUniform
by the double buffer.buffer.clear()
instead of clearing the current buffer, swap the buffers and then clear the 2-frames-behind bufferMeshPipelineKey::MOTION_VECTOR_PREPASS
indicates that the mesh supports motion vectorsMeshPipelineKey::VIEW_MOTION_VECTOR_PREPASS
. It is technically meaningless outside of the prepass pipeline, but it avoids widening the key, and therefore slow down processing.The other changes are the shader code for handling the previous values (which is mostly copy/pasting, sadly); And some attempt at copping with the combinatorial explosion of shader variants in
mesh_bindings.rs
. (we have the following variables: with/out morph targets, with/out skinning with/out motion vectors)Note to reviewers
mesh_bindings.rs
Following is a description of the
mesh_bindings.rs
file.Entry definition
We first define each individual bind group (layout) entries as functions that accept a binding index and an additional parameter for the binding resource when relevant, and return the layout/bind group entry.
Those are the
FOOBAR_layout
andFOOBAR_entry
functions.Combination functions
Then we have
variant_layout
andMeshBindGroupBuilder::add_variant
.Those take a
ActiveVariant
1 (or build one), create an empty buffer (SmallVec
) and progressively add the entries based on what flag is active in theActiveVariant
. Then we create theBindGroupLayout
orBindGroup
(respectively) with all the gathered entries.MeshBindGroups
MeshBindGroups
is a collection ofBindGroup
s.To add bind groups to it, you would first create a
MeshBindGroupBuilder
usingMeshBindGroups::bind_group_builder
. Then, calladd_variant
for eachBindGroup
that will be needed for rendering.MeshBindGroups
has twoHashMap
, oneshared
that contains bind groups that can be shared between different entities with the same set of shader features (ActiveVariant
elements).The other
HashMap
isdistinct
, which contain bind groups that are unique per-mesh. This is currently only relevant for meshes with morph targets. (Morph targets have a texture binding, therefore need to bind to a different individual texture for each individual mesh).The idea of two hashmaps, one for shared and another for distinct bind groups is taken from #10235.
In bevy,
prepare_mesh_bind_group
is where variants are added toMeshBindGroups
.MeshLayouts
MeshLayouts
is to layout whatMeshBindGroups
is to bind groups. Here, however, since there is exactly one layout per set of shader feature, we can pre-compute every possible layouts. This is what occurs inMeshLayouts::new
, which is called in theFromWorld
impl ofMeshPipeline
. Then, we can just get an existing layout by usingMeshLayouts::get
, insetup_morph_and_skinning_defs
; AndMeshLayouts::get_variant
inadd_variant
(to get the corresponding layout for the bind group).This approach is a shameless clone of the #10156 approach. Unlike with
MeshBindGroups
, we don't even need to enumerate every features. By usingarray::from_fn::<ActiveVariant::COUNT>
, we automatically get an enumeration of all possible (and more) shader feature combinations.ActiveVariant
A downside of the
bitflags
crate is that it conflates carelessly a set of FOO and FOO, it makes it harder to talk about the bitflag struct.An
ActiveVariant
is a set of shader features, a feature may be (1) skinning, (2) morph targets, (3) motion vectors, (N) more in the future.Since a shader is composed of 0-to-N features, a single
ActiveVariant
represents a single instance of a shader.MeshPipelineKey::VIEW_MOTION_VECTOR_PREPASS
There are two problems with a double buffer approach:
current
buffer is filled, there is no such thing as theprevious
buffer pageprevious
buffer with the current frame's indices is bogus.To solve (1) we need to disable motion vectors, not for the whole view, but for individual meshes. Meshes which skinning or morphing shader are enabled, but didn't exist last frame (their
previous
buffer page is empty) can't have aprevious
buffer, therefore shouldn't have the binding for the previous buffer set.This is where we introduce
MeshPipelineKey::VIEW_MOTION_VECTOR_PREPASS
. It is a distinct value thanMeshPipelineKey::MOTION_VECTOR_PREPASS
. The view still has access to the motion vector buffer, but the mesh/instance doesn't have aprevious
buffer to compute the motion vector with. So we need to have two different key flags to handle this.To solve (2), we make the double buffer generic over the buffer, and we allow to use it for
EntityHashMap
, now we can keep around the indices for the previous frame as well as the buffer.Note that we add special logic so that when motion vectors aren't used, we don't double buffer anything, so that no one pays the cost (memory cost, as it has no runtime cost) of the double-buffer when not needing it.
Tricky bits
The code changes outside of
mesh_bindings.rs
sure seem small (evendouble_buffer.rs
is 62 lines). But it was actually extremely complex to get right. Bind groups that depend on runtime values such as "does the double buffer have a 'previous' page?" Are really tricky to get right.MeshPipelineKey
is more or less a "cache" for the world state, but this cache is set in several places. I often got the infamous wgpu error, and had no way to tell where my errors where. I had to patiently add printlns a bit everywhere, glare at the logs and try to think what explained them, compound that with how many rendering functionalities I had to rebase through, it was very difficult.Merging this PR will probably move the burden of handling this to more people, and we should prioritize making that kind of bugs impossible in the future.
Testing
I used this modified version of
shader_prepass.rs
to test the change set: https://gist.github.com/nicopap/bf97174f1496df4854909d672eb2cc0f. Note that it is still missing a morph target + skinned mesh (I tested a morph+skin model locally and seemed to work).I tried to compare how this affects ghosting with TAA, but I failed to reproduce ghosting artifacts on main. The motion vectors do look correct though.
Future work
Migration guide
SetMeshBindGroup
now requires an additional boolean const parameter, set it tofalse
to keep the old behavior,true
if you want it to handle motion vectors.MeshBindGroups
'sreset
method is nowclear
.MeshLayouts
has been revamped. To get bind groups from the layouts, useMeshBindGroups::bind_group_builder
, or access manually the layouts.Changelog
Footnotes
ActiveVariant
is a "set of shader features", see relevant section ↩